A New Approach to Protein Structure Mining and Alignment

Authors

  • Hongyuan Li
  • Keith Marsolo
  • Srinivasan Parthasarathy
  • Dmitrii Polshakov
Abstract

One of the largest areas of bioinformatics and data mining research has been the protein domain. These efforts have included protein structure prediction, folding pathway prediction, sequence alignment, ab initio simulation, structure alignment, substructure detection and many others. Substructure detection is generally defined as the mining of a molecule's 3D structure in order to find interesting or frequent domains. Sequence alignment involves determining the similarity of two (or more) protein molecules based on how well their amino acid sequences "match." Both problems have potential pitfalls, however. In substructure mining, focusing solely on structural information can lead to the discovery of biologically irrelevant substructures. In sequence alignment, the results can vary greatly depending on the substitution matrix used. In this paper we describe a method that combines the benefits of substructure mining and sequence alignment in an attempt to determine the similarity between protein molecules. In the absence of biological information, our method quickly and efficiently mines a protein molecule to determine frequent local structures. With the addition of biological sequence information, our algorithm provides a way to align proteins with similar local structures and sequence, yielding a global alignment between molecules. We present a novel structure mining/alignment algorithm, as well as additional work on a new clustering metric for amino acids based on several different physico-chemical properties. This metric is used with our alignment algorithm to provide a mechanism for globally aligning protein molecules.
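The abstract's remark that alignment results depend on the substitution matrix can be illustrated with a small sketch. This is not the authors' algorithm: it is a textbook Needleman-Wunsch global alignment score, and the names `global_alignment_score`, `identity_score`, `grouped_score`, and the residue groupings in `GROUPS` are hypothetical choices made purely for illustration. The grouping is a rough stand-in for a physico-chemical clustering, not the metric proposed in the paper.

```python
from typing import Callable

def global_alignment_score(a: str, b: str,
                           score: Callable[[str, str], int],
                           gap: int = -2) -> int:
    """Needleman-Wunsch global alignment score (no traceback)."""
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            curr.append(max(
                prev[j - 1] + score(a[i - 1], b[j - 1]),  # align two residues
                prev[j] + gap,                            # gap in b
                curr[j - 1] + gap,                        # gap in a
            ))
        prev = curr
    return prev[-1]

def identity_score(x: str, y: str) -> int:
    """Scoring that only rewards exact residue matches."""
    return 1 if x == y else -1

# Hypothetical, coarse residue classes (assumption, for illustration only).
GROUPS = {
    "hydrophobic": set("AVLIMFWC"),
    "polar": set("STNQYGP"),
    "positive": set("KRH"),
    "negative": set("DE"),
}

def group_of(x: str) -> str:
    for name, members in GROUPS.items():
        if x in members:
            return name
    raise ValueError(f"unknown residue: {x}")

def grouped_score(x: str, y: str) -> int:
    """Scoring that also rewards substitutions within the same class."""
    if x == y:
        return 2
    return 1 if group_of(x) == group_of(y) else -1

if __name__ == "__main__":
    seq_a, seq_b = "LKVD", "IRAE"
    print(global_alignment_score(seq_a, seq_b, identity_score))
    print(global_alignment_score(seq_a, seq_b, grouped_score))
```

With these toy sequences the identity scoring gives -4 while the grouped scoring gives 4: the same pair looks unrelated or closely related depending only on how substitutions are scored.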

Similar resources

In Silico Analysis of Primary Sequence and Tertiary Structure of Lepidium Draba Peroxidase

Peroxidase enzymes are widely applicable in industry and diagnosis. Recently, we introduced a new kind of peroxidase gene from Lepidium draba (LDP). According to protein multiple sequence alignment results, LDP had 93% similarity and 88.96% identity with horseradish peroxidase C1A (HRP C1A). In the current study we employed in silico tools to determine to which group of peroxidase enzymes LDP...


Bioinformatics Analysis of Upstream Region and Protein Structure of Fungal Phytase Gene

Phytase increases the bioavailability of phytate phosphorus in seed-based animal feeds and reduces the phosphorus pollution of animal waste. Since most animal feeds for pellets are heated up to 65-80 °C, the production of a thermostable structure for phytase can be useful. In this study, we sought to perform bioinformatics analysis of the upstream region and protein structure of fungal phytase ...


Protein-Protein Interaction Analysis of Common Top Genes in Obsessive-Compulsive disorder (OCD) and Schizophrenia: Towards New Drug Approach

Comorbidity is common among psychiatric disorders, including obsessive-compulsive disorder and schizophrenia. Many studies have suggested that the disorders may share the same etiological bases. In this regard, the shared glutamate, dopaminergic, and serotonin pathways are the known ones. Here, the common significant genes are examined to understand the possible molecular origin of the diso...


A new approach for assessing stability of rock slopes considering centroids of weak zones

The intersection lines between discontinuity surfaces and their intersection points on the visible surfaces of any engineering structure may be the instability indicators. This paper describes a new approach to modelling the intersecting lines and points that would provide the first evaluation of any instability in an engineering structure characterized by the failure modes. In this work, the i...


PADS: Protein Structure Alignment Using Directional Shape Signatures

A novel data mining approach for similarity search and knowledge discovery in protein structure databases is proposed. PADS (Protein structure Alignment by Directional shape Signatures) incorporates the three-dimensional coordinates of the main atoms of each amino acid and extracts a geometrical shape signature along with the direction of each amino acid. As a result, each protein structure is ...



Publication date: 2004